import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import rcParams
import plotly.express as px
import numpy as np
plt.rcParams["figure.figsize"]=[20,10]
d=pd.read_csv("C:\\Users\\Sagar\\Downloads\\fifa_eda_stats.csv")
d.head()
| | Name | Age | Nationality | Overall | Potential | Club | Value | Wage | Preferred Foot | International Reputation | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | L. Messi | 31 | Argentina | 94 | 94 | FC Barcelona | €110.5M | €565K | Left | 5.0 | ... | 59.0 | 94.0 | 48.0 | 22.0 | 94.0 | 94.0 | 75.0 | 96.0 | 28.0 | 26.0 |
| 1 | Cristiano Ronaldo | 33 | Portugal | 94 | 94 | Juventus | €77M | €405K | Right | 5.0 | ... | 79.0 | 93.0 | 63.0 | 29.0 | 95.0 | 82.0 | 85.0 | 95.0 | 31.0 | 23.0 |
| 2 | Neymar Jr | 26 | Brazil | 92 | 93 | Paris Saint-Germain | €118.5M | €290K | Right | 5.0 | ... | 49.0 | 82.0 | 56.0 | 36.0 | 89.0 | 87.0 | 81.0 | 94.0 | 24.0 | 33.0 |
| 3 | De Gea | 27 | Spain | 91 | 93 | Manchester United | €72M | €260K | Right | 4.0 | ... | 64.0 | 12.0 | 38.0 | 30.0 | 12.0 | 68.0 | 40.0 | 68.0 | 21.0 | 13.0 |
| 4 | K. De Bruyne | 27 | Belgium | 91 | 92 | Manchester City | €102M | €355K | Right | 4.0 | ... | 75.0 | 91.0 | 76.0 | 61.0 | 87.0 | 94.0 | 79.0 | 88.0 | 58.0 | 51.0 |
5 rows × 44 columns
d.tail()
| | Name | Age | Nationality | Overall | Potential | Club | Value | Wage | Preferred Foot | International Reputation | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18202 | J. Lundstram | 19 | England | 47 | 65 | Crewe Alexandra | €60K | €1K | Right | 1.0 | ... | 47.0 | 38.0 | 46.0 | 46.0 | 39.0 | 52.0 | 43.0 | 45.0 | 48.0 | 47.0 |
| 18203 | N. Christoffersson | 19 | Sweden | 47 | 63 | Trelleborgs FF | €60K | €1K | Right | 1.0 | ... | 67.0 | 42.0 | 47.0 | 16.0 | 46.0 | 33.0 | 43.0 | 42.0 | 15.0 | 19.0 |
| 18204 | B. Worman | 16 | England | 47 | 67 | Cambridge United | €60K | €1K | Right | 1.0 | ... | 32.0 | 45.0 | 32.0 | 15.0 | 48.0 | 43.0 | 55.0 | 41.0 | 13.0 | 11.0 |
| 18205 | D. Walker-Rice | 17 | England | 47 | 66 | Tranmere Rovers | €60K | €1K | Right | 1.0 | ... | 48.0 | 34.0 | 33.0 | 22.0 | 44.0 | 47.0 | 50.0 | 46.0 | 25.0 | 27.0 |
| 18206 | G. Nugent | 16 | England | 46 | 66 | Tranmere Rovers | €60K | €1K | Right | 1.0 | ... | 60.0 | 32.0 | 56.0 | 42.0 | 34.0 | 49.0 | 33.0 | 43.0 | 43.0 | 50.0 |
5 rows × 44 columns
d.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 17918 entries, 0 to 18206
Data columns (total 44 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Name                      17918 non-null  object
 1   Age                       17918 non-null  int64
 2   Nationality               17918 non-null  object
 3   Overall                   17918 non-null  int64
 4   Potential                 17918 non-null  int64
 5   Club                      17918 non-null  object
 6   Value                     17918 non-null  object
 7   Wage                      17918 non-null  object
 8   Preferred Foot            17918 non-null  object
 9   International Reputation  17918 non-null  float64
 10  Weak Foot                 17918 non-null  float64
 11  Work Rate                 17918 non-null  object
 12  Body Type                 17918 non-null  object
 13  Position                  17918 non-null  object
 14  Height                    17918 non-null  object
 15  Weight                    17918 non-null  object
 16  Crossing                  17918 non-null  float64
 17  Finishing                 17918 non-null  float64
 18  HeadingAccuracy           17918 non-null  float64
 19  ShortPassing              17918 non-null  float64
 20  Volleys                   17918 non-null  float64
 21  Dribbling                 17918 non-null  float64
 22  Curve                     17918 non-null  float64
 23  FKAccuracy                17918 non-null  float64
 24  LongPassing               17918 non-null  float64
 25  BallControl               17918 non-null  float64
 26  Acceleration              17918 non-null  float64
 27  SprintSpeed               17918 non-null  float64
 28  Agility                   17918 non-null  float64
 29  Reactions                 17918 non-null  float64
 30  Balance                   17918 non-null  float64
 31  ShotPower                 17918 non-null  float64
 32  Jumping                   17918 non-null  float64
 33  Stamina                   17918 non-null  float64
 34  Strength                  17918 non-null  float64
 35  LongShots                 17918 non-null  float64
 36  Aggression                17918 non-null  float64
 37  Interceptions             17918 non-null  float64
 38  Positioning               17918 non-null  float64
 39  Vision                    17918 non-null  float64
 40  Penalties                 17918 non-null  float64
 41  Composure                 17918 non-null  float64
 42  StandingTackle            17918 non-null  float64
 43  SlidingTackle             17918 non-null  float64
dtypes: float64(30), int64(3), object(11)
memory usage: 6.2+ MB
d.describe()
| | Age | Overall | Potential | International Reputation | Weak Foot | Crossing | Finishing | HeadingAccuracy | ShortPassing | Volleys | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | ... | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 | 17918.000000 |
| mean | 25.105257 | 66.236801 | 71.329334 | 1.113908 | 2.947260 | 49.748856 | 45.581147 | 52.295290 | 58.713417 | 42.932135 | ... | 65.323697 | 47.130316 | 55.879060 | 46.690870 | 49.995758 | 53.448934 | 48.544480 | 58.655263 | 47.684005 | 45.643208 |
| std | 4.675372 | 6.929243 | 6.144098 | 0.395495 | 0.660106 | 18.354989 | 19.512533 | 17.367823 | 14.680340 | 17.688194 | ... | 12.552242 | 19.251517 | 17.354347 | 20.691841 | 19.521104 | 14.119193 | 15.691563 | 11.420965 | 21.647674 | 21.270735 |
| min | 16.000000 | 46.000000 | 48.000000 | 1.000000 | 1.000000 | 5.000000 | 2.000000 | 4.000000 | 7.000000 | 4.000000 | ... | 17.000000 | 3.000000 | 11.000000 | 3.000000 | 2.000000 | 10.000000 | 5.000000 | 3.000000 | 2.000000 | 3.000000 |
| 25% | 21.000000 | 62.000000 | 67.000000 | 1.000000 | 3.000000 | 38.000000 | 30.000000 | 44.000000 | 54.000000 | 30.000000 | ... | 58.000000 | 33.000000 | 44.000000 | 26.000000 | 39.000000 | 44.000000 | 39.000000 | 51.000000 | 27.000000 | 24.000000 |
| 50% | 25.000000 | 66.000000 | 71.000000 | 1.000000 | 3.000000 | 54.000000 | 49.000000 | 56.000000 | 62.000000 | 44.000000 | ... | 67.000000 | 51.000000 | 59.000000 | 52.000000 | 55.000000 | 55.000000 | 49.000000 | 60.000000 | 55.000000 | 52.000000 |
| 75% | 28.000000 | 71.000000 | 75.000000 | 1.000000 | 3.000000 | 64.000000 | 62.000000 | 64.000000 | 68.000000 | 57.000000 | ... | 74.000000 | 62.000000 | 69.000000 | 64.000000 | 64.000000 | 64.000000 | 60.000000 | 67.000000 | 66.000000 | 64.000000 |
| max | 45.000000 | 94.000000 | 95.000000 | 5.000000 | 5.000000 | 93.000000 | 95.000000 | 94.000000 | 93.000000 | 90.000000 | ... | 97.000000 | 94.000000 | 95.000000 | 92.000000 | 95.000000 | 94.000000 | 92.000000 | 96.000000 | 93.000000 | 91.000000 |
8 rows × 33 columns
#Data Cleaning
d.isnull().sum()
Name                         0
Age                          0
Nationality                  0
Overall                      0
Potential                    0
Club                       241
Value                        0
Wage                         0
Preferred Foot              48
International Reputation    48
Weak Foot                   48
Work Rate                   48
Body Type                   48
Position                    60
Height                      48
Weight                      48
Crossing                    48
Finishing                   48
HeadingAccuracy             48
ShortPassing                48
Volleys                     48
Dribbling                   48
Curve                       48
FKAccuracy                  48
LongPassing                 48
BallControl                 48
Acceleration                48
SprintSpeed                 48
Agility                     48
Reactions                   48
Balance                     48
ShotPower                   48
Jumping                     48
Stamina                     48
Strength                    48
LongShots                   48
Aggression                  48
Interceptions               48
Positioning                 48
Vision                      48
Penalties                   48
Composure                   48
StandingTackle              48
SlidingTackle               48
dtype: int64
d.isnull()
| | Name | Age | Nationality | Overall | Potential | Club | Value | Wage | Preferred Foot | International Reputation | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 1 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 2 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 3 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 4 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 18202 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 18203 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 18204 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 18205 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 18206 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
18207 rows × 44 columns
#inplace=True drops the rows from d itself instead of returning a new DataFrame
d.dropna(inplace=True)
d.isnull().sum()
Name                        0
Age                         0
Nationality                 0
Overall                     0
Potential                   0
Club                        0
Value                       0
Wage                        0
Preferred Foot              0
International Reputation    0
Weak Foot                   0
Work Rate                   0
Body Type                   0
Position                    0
Height                      0
Weight                      0
Crossing                    0
Finishing                   0
HeadingAccuracy             0
ShortPassing                0
Volleys                     0
Dribbling                   0
Curve                       0
FKAccuracy                  0
LongPassing                 0
BallControl                 0
Acceleration                0
SprintSpeed                 0
Agility                     0
Reactions                   0
Balance                     0
ShotPower                   0
Jumping                     0
Stamina                     0
Strength                    0
LongShots                   0
Aggression                  0
Interceptions               0
Positioning                 0
Vision                      0
Penalties                   0
Composure                   0
StandingTackle              0
SlidingTackle               0
dtype: int64
d.shape
(17918, 44)
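Dropping every row that has any missing value removes 289 players here (18207 rows down to 17918). One possible alternative, sketched below on a small made-up frame, is to pass `subset=` to `dropna` so that only gaps in the columns an analysis actually uses cause a row to be dropped. The toy data and the choice of `Position` as the required column are assumptions for illustration, not taken from the dataset.

```python
import pandas as pd

# Toy frame standing in for the FIFA data (all values are made up).
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Club": ["X", None, "Y", "Z"],
    "Position": ["ST", "GK", None, "CB"],
    "Overall": [80, 75, 70, 65],
})

# Default behaviour: drop a row if ANY column is missing.
strict = toy.dropna()

# Drop a row only if one of the listed columns is missing.
lenient = toy.dropna(subset=["Position"])

print(len(toy), len(strict), len(lenient))
```

Whether the stricter or the more lenient rule is appropriate depends on which columns the later plots rely on.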
#Histogram of Age along the x-axis, stacked by Preferred Foot
sns.histplot(x=d["Age"],hue=d["Preferred Foot"],multiple="stack",palette="Spectral",edgecolor="k")
plt.title("Distribution of Age of the players Based on their Preferred foot",fontsize=18)
plt.xlabel("Age of players",fontsize=15)
plt.ylabel("Count",fontsize=15)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.show()
The histogram shows that players aged 20-26 are the most numerous and that most of them are right-footed.
d["Age"].mean()
25.122205745043114
d["Nationality"].value_counts().head(20).plot(kind="bar")
plt.title("Top 20 nations with maximum Players",fontsize=18)
plt.xlabel("Nations",fontsize=18)
plt.ylabel("Count",fontsize=18)
plt.xticks(fontsize=18)
plt.yticks(fontsize=15)
plt.show()
#data cleaning
fifa=d.copy()
def str2float(euros):
    #convert strings such as "€110.5M" or "€565K" to a float amount in euros
    if euros[-1]=="M":
        return float(euros[1:-1])*1000000
    elif euros[-1]=="K":
        return float(euros[1:-1])*1000
    else:
        return float(euros[1:])
fifa['Value']=fifa['Value'].apply(str2float)
fifa['Wage']=fifa['Wage'].apply(str2float)
fifa[["Name","Value","Wage"]]
| | Name | Value | Wage |
|---|---|---|---|
| 0 | L. Messi | 110500000.0 | 565000.0 |
| 1 | Cristiano Ronaldo | 77000000.0 | 405000.0 |
| 2 | Neymar Jr | 118500000.0 | 290000.0 |
| 3 | De Gea | 72000000.0 | 260000.0 |
| 4 | K. De Bruyne | 102000000.0 | 355000.0 |
| ... | ... | ... | ... |
| 18202 | J. Lundstram | 60000.0 | 1000.0 |
| 18203 | N. Christoffersson | 60000.0 | 1000.0 |
| 18204 | B. Worman | 60000.0 | 1000.0 |
| 18205 | D. Walker-Rice | 60000.0 | 1000.0 |
| 18206 | G. Nugent | 60000.0 | 1000.0 |
17918 rows × 3 columns
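The conversion above relies on every string starting with `€` and ending in `M`, `K` or a digit. A slightly more defensive variant could look like the sketch below. The name `parse_euros` and the table-driven suffix lookup are illustrative choices, not part of the original notebook, and the accepted input formats are assumed from the `Value` and `Wage` samples shown earlier.

```python
def parse_euros(text):
    """Convert strings such as '€110.5M', '€565K' or '€0' to a float amount in euros."""
    multipliers = {"M": 1_000_000, "K": 1_000}
    number = text.strip().lstrip("€")   # tolerate surrounding spaces and a missing symbol
    suffix = number[-1:].upper()        # last character, or '' for an empty string
    if suffix in multipliers:
        return float(number[:-1]) * multipliers[suffix]
    return float(number)


for sample in ["€110.5M", "€565K", "€0"]:
    print(sample, parse_euros(sample))
```

It can be applied the same way as `str2float`, for example with `fifa["Value"].map(parse_euros)` on a column that still holds the original strings.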
#Count of players based on their height
sns.countplot(x=d["Height"],edgecolor="k",palette="Blues")
sns.set_theme(style="darkgrid")
plt.xlabel("Height",fontsize=15,weight="bold")
plt.ylabel("Number of players",fontsize=15,weight="bold")
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.title("Count of players based on their height",fontsize=18)
plt.show()
Comparing the best players
skills=list(d.columns)
skills
['Name', 'Age', 'Nationality', 'Overall', 'Potential', 'Club', 'Value', 'Wage', 'Preferred Foot', 'International Reputation', 'Weak Foot', 'Work Rate', 'Body Type', 'Position', 'Height', 'Weight', 'Crossing', 'Finishing', 'HeadingAccuracy', 'ShortPassing', 'Volleys', 'Dribbling', 'Curve', 'FKAccuracy', 'LongPassing', 'BallControl', 'Acceleration', 'SprintSpeed', 'Agility', 'Reactions', 'Balance', 'ShotPower', 'Jumping', 'Stamina', 'Strength', 'LongShots', 'Aggression', 'Interceptions', 'Positioning', 'Vision', 'Penalties', 'Composure', 'StandingTackle', 'SlidingTackle']
skill=[ 'Crossing',
'Finishing',
'HeadingAccuracy',
'ShortPassing',
'Volleys',
'Dribbling',
'Curve',
'FKAccuracy',
'LongPassing',
'BallControl',
'Acceleration',
'SprintSpeed',
'Agility',
'Reactions',
'Balance',
'ShotPower',
'Jumping',
'Stamina',
'Strength',
'LongShots',
'Aggression',
'Interceptions',
'Positioning',
'Vision',
'Penalties',
'Composure',
'StandingTackle',
'SlidingTackle']
#Compare the two best players (Messi and Ronaldo) on these skills
messi=d.loc[d["Name"]=="L. Messi"]
messi=pd.DataFrame(messi,columns=skill)
ronaldo=d.loc[d["Name"]=="Cristiano Ronaldo"]
ronaldo=pd.DataFrame(ronaldo,columns=skill)
messi
| | Crossing | Finishing | HeadingAccuracy | ShortPassing | Volleys | Dribbling | Curve | FKAccuracy | LongPassing | BallControl | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 84.0 | 95.0 | 70.0 | 90.0 | 86.0 | 97.0 | 93.0 | 94.0 | 87.0 | 96.0 | ... | 59.0 | 94.0 | 48.0 | 22.0 | 94.0 | 94.0 | 75.0 | 96.0 | 28.0 | 26.0 |
1 rows × 28 columns
ronaldo
| | Crossing | Finishing | HeadingAccuracy | ShortPassing | Volleys | Dribbling | Curve | FKAccuracy | LongPassing | BallControl | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 84.0 | 94.0 | 89.0 | 81.0 | 87.0 | 88.0 | 81.0 | 76.0 | 77.0 | 94.0 | ... | 79.0 | 93.0 | 63.0 | 29.0 | 95.0 | 82.0 | 85.0 | 95.0 | 31.0 | 23.0 |
1 rows × 28 columns
sns.pointplot(data=messi,color="blue")
sns.pointplot(data=ronaldo,color="red")
plt.xticks(rotation=90,fontsize=14)
plt.yticks(fontsize=14)
plt.title("Messi Vs Ronaldo",fontsize=20)
plt.xlabel("Skills",fontsize=20)
plt.ylabel("Skills Value",fontsize=20)
plt.grid()
Top 5 nations with overall best players
t_nations=d.groupby(['Nationality'])["Overall"].max().sort_values(ascending=False).head(5)
#Group by Nationality, take the maximum Overall per nation, sort in descending order and keep the top 5
t_nations
Nationality
Argentina    94
Portugal     94
Brazil       92
Croatia      91
Uruguay      91
Name: Overall, dtype: int64
Top 5 clubs with overall best players
t_clubs=d.groupby(['Club'])["Overall"].max().sort_values(ascending=False).head(5)
t_clubs
Club
Juventus               94
FC Barcelona           94
Paris Saint-Germain    92
Chelsea                91
Manchester United      91
Name: Overall, dtype: int64
# Age distribution of players in Countries
countries_name=("Argentina","Portugal","Brazil","Croatia","Uruguay")
#filter on Nationality only: adding "& d['Age']" would bitwise-AND the ages and keep only odd ones
c=d.loc[d["Nationality"].isin(countries_name)]
sns.boxenplot(x="Nationality",y="Age",data=c)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlabel("Countries",fontsize=15)
plt.ylabel("Age",fontsize=15)
plt.title("Age Distribution of players in Countries",fontsize=15)
plt.grid()
Argentina has players across the widest age range, roughly 18 to 40, and most of its players are between 22 and 32.
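When building row filters with `isin`, it helps to keep the mask purely boolean. `isin` already returns True/False per row, and combining that mask with an integer column through `&` performs a bitwise AND rather than a "column is present" check. The sketch below, on made-up data, suggests how such a combined mask ends up keeping only odd values. The frame and its values are assumptions for illustration.

```python
import pandas as pd

# Made-up rows; only the mechanics of the mask matter here.
toy = pd.DataFrame({
    "Nationality": ["Brazil", "Brazil", "Spain", "Brazil"],
    "Age": [26, 27, 25, 31],
})

in_brazil = toy["Nationality"].isin(["Brazil"])

# Bitwise AND of a boolean mask with integers tests the lowest bit of each
# integer, so rows with an even Age are silently filtered out.
mixed = (in_brazil & toy["Age"]).astype(bool)

print(toy.loc[in_brazil, "Age"].tolist())
print(toy.loc[mixed, "Age"].tolist())
```

If a second condition is really needed, it should itself be a comparison, for example `in_brazil & (toy["Age"] >= 18)`.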
#Age distribution of Players in Clubs
sns.set_theme(style="darkgrid")
club_n=("Juventus","FC Barcelona","Paris Saint-Germain","Chelsea","Manchester United")
club=d.loc[d["Club"].isin(club_n)] #boolean mask only; "& d['Age']" would keep only odd ages
sns.boxplot(x="Club",y="Age",data=club,dodge=True,palette="turbo",showmeans=True)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlabel("Clubs",fontsize=18)
plt.ylabel("Age",fontsize=15)
plt.title("Age Distribution of players in Clubs",fontsize=18)
plt.grid()
Among the top 5 clubs, Juventus has very few young players, and most of its squad is older than 25. Paris Saint-Germain, by contrast, has a large number of young players.
#Player count and average Overall rating per nation
#groupby("Nationality") splits the rows by nation, and each lambda receives one nation's rows as x
#x["Overall"] is that nation's Overall rating column
best_avg_Overall=d.groupby("Nationality").apply(lambda x:np.average(x["Overall"])).reset_index(name="Overall Ratings")
best_avg_player=d.groupby("Nationality").apply(lambda x:x["Overall"].count()).reset_index(name="Player Counts")
best_avg_count=pd.merge(best_avg_Overall,best_avg_player,how="inner",left_on="Nationality",right_on="Nationality")
top=best_avg_count[best_avg_count["Player Counts"]>=200]
top=top.sort_values(by=["Overall Ratings","Player Counts"],ascending=False) #sort_values returns a new frame, so keep the result
px.scatter(top,x="Overall Ratings",y="Player Counts",color="Player Counts",hover_data=["Nationality"])
England has 1,662 players with an average Overall rating of about 63.42.
best_avg_count
| | Nationality | Overall Ratings | Player Counts |
|---|---|---|---|
| 0 | Afghanistan | 61.000000 | 4 |
| 1 | Albania | 65.925000 | 40 |
| 2 | Algeria | 70.633333 | 60 |
| 3 | Andorra | 62.000000 | 1 |
| 4 | Angola | 67.600000 | 15 |
| ... | ... | ... | ... |
| 159 | Uzbekistan | 67.500000 | 2 |
| 160 | Venezuela | 67.268657 | 67 |
| 161 | Wales | 64.139535 | 129 |
| 162 | Zambia | 65.222222 | 9 |
| 163 | Zimbabwe | 69.769231 | 13 |
164 rows × 3 columns
best_avg_player
| | Nationality | Player Counts |
|---|---|---|
| 0 | Afghanistan | 4 |
| 1 | Albania | 40 |
| 2 | Algeria | 60 |
| 3 | Andorra | 1 |
| 4 | Angola | 15 |
| ... | ... | ... |
| 159 | Uzbekistan | 2 |
| 160 | Venezuela | 67 |
| 161 | Wales | 129 |
| 162 | Zambia | 9 |
| 163 | Zimbabwe | 13 |
164 rows × 2 columns
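The per-nation average and player count computed above with two `apply` calls and a merge can also be produced in a single pass with named aggregation. The sketch below uses a small made-up frame; the output column names mirror the ones used in this notebook, and the data is an assumption for illustration.

```python
import pandas as pd

# Made-up ratings for two nations.
toy = pd.DataFrame({
    "Nationality": ["England", "England", "Spain", "Spain", "Spain"],
    "Overall": [60, 70, 80, 82, 84],
})

# Each output column is defined as (source column, aggregation).
# The dict is unpacked because the output names contain spaces.
summary = (
    toy.groupby("Nationality")
       .agg(**{
           "Overall Ratings": ("Overall", "mean"),
           "Player Counts": ("Overall", "count"),
       })
       .reset_index()
)

print(summary)
```

This avoids the intermediate frames and the merge, at the cost of a slightly denser expression.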
sns.countplot(x=fifa["Value"],order=fifa.Value.value_counts().iloc[:5].index,palette="gist_heat")
sns.boxplot(x=d["Age"],y=d["Potential"],hue=d["Preferred Foot"],palette="ocean")
plt.xlabel("AGE OF THE PLAYERS",fontsize=15,weight="bold")
plt.ylabel("POTENTIAL",fontsize=15,weight="bold")
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.title("POTENTIAL OF THE PLAYERS WITH RESPECT TO THEIR AGE",fontsize=17,weight="bold")
plt.show()
Among players aged 20-32, left-footed players show higher Potential than right-footed players.
sns.scatterplot(x=fifa["Overall"],y=fifa["Value"],color="r",s=100,edgecolor="k",alpha=0.4)
sns.set_theme(style="darkgrid")
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlabel("Overall Ratings",fontsize=15,weight="bold")
plt.ylabel("Value",fontsize=15,weight="bold")
plt.title("Relationship Between Overall Ratings and Value of the players",fontsize=20)
plt.show()
The graph above shows that a player's Value increases as their Overall rating increases.
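A scatter plot gives a visual impression of this relationship; a correlation coefficient can put a number on it. The sketch below uses made-up ratings and values, not the dataset, and contrasts Pearson correlation with a rank-based (Spearman-style) correlation computed by correlating the ranks. Which measure fits the real data better is an open question here.

```python
import pandas as pd

# Made-up ratings and values; Value grows much faster than Overall.
toy = pd.DataFrame({
    "Overall": [60, 70, 80, 90, 94],
    "Value": [100_000.0, 1_000_000.0, 10_000_000.0, 50_000_000.0, 110_000_000.0],
})

# Pearson correlation measures how close the relationship is to a straight line.
pearson = toy["Overall"].corr(toy["Value"])

# Correlating the ranks measures whether Value rises whenever Overall rises,
# regardless of the shape of the curve.
spearman = toy["Overall"].rank().corr(toy["Value"].rank())

print(round(pearson, 3), round(spearman, 3))
```

On the cleaned frame the same two lines could be run on `fifa["Overall"]` and `fifa["Value"]`.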
sns.set_style("whitegrid")
sns.set_color_codes()
sns.kdeplot(data=d,x="Potential",hue="Preferred Foot",multiple="stack",palette="seismic",edgecolor="k")
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlabel("Potential",fontsize=15,weight="bold")
plt.title("KDE(Potential of the players based on their Preferred Foot)",fontsize=18)
plt.show()
From the plot above, the density of Potential peaks at about 70, and left-footed players have higher Potential than right-footed players.
sns.jointplot(data=fifa,x=fifa["Potential"],y=fifa["Age"],kind="reg",marker="+",marginal_ticks=True,
marginal_kws=dict(bins=25,fill=False),color='m')
Players aged 22-30 have the highest Potential.
sns.stripplot(x=fifa["Work Rate"],y=fifa["Wage"],data=fifa,
jitter=False,s=20,marker="D",linewidth=1,alpha=0.2,palette="seismic",edgecolor="k")
sns.pairplot(fifa[["Potential","Wage","Age","Value"]])
plt.bar(list(fifa["Nationality"].value_counts()[0:5].keys()),list(fifa["Nationality"].value_counts()[0:5]),color="lightgreen",
edgecolor="k")
plt.xticks(fontsize=18)
plt.yticks(fontsize=18)
plt.show()
#keys() returns only the category labels, here the nation names
From the graph above, most players come from England, which has more than 1,600.
plt.bar(list(fifa["Name"])[0:5],list(fifa["Wage"])[0:5],color="lightblue",edgecolor="k")
#bars for the first five rows of fifa, i.e. the five highest-rated players
L. Messi has the highest wage of the five players shown (€565K).
#weight vs dribbling
plt.xlabel('Weight', fontsize=25)
plt.ylabel('Dribbling', fontsize=25)
plt.title('Weight vs Dribbling', fontsize = 25)
sns.barplot(x='Weight', y='Dribbling', data=d.sort_values('Weight'),palette="viridis",edgecolor="k")
plt.xticks(rotation=90,fontsize=16,weight="bold")
plt.yticks(weight="bold",fontsize=16)
plt.show()
From the figure above, dribbling skill decreases as weight increases, and very few players heavier than 205 lbs are good at dribbling.
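`Height` and `Weight` are stored as text (dtype `object` in the `d.info()` output), so sorting or plotting them directly orders the labels as strings. Assuming the values look like `6'0` and `205lbs`, as the figures quoted in this notebook suggest, a sketch of converting them to numbers might be the following. The helper names `weight_to_lbs` and `height_to_inches` are hypothetical.

```python
def weight_to_lbs(text):
    """Turn a string such as '159lbs' into the number 159.0."""
    cleaned = text.strip().lower()
    if cleaned.endswith("lbs"):
        cleaned = cleaned[:-3]
    return float(cleaned)


def height_to_inches(text):
    """Turn a feet'inches string such as "6'0" into total inches."""
    feet, inches = text.strip().split("'")
    return int(feet) * 12 + int(inches)


heights = ["5'9", "6'0", "5'10"]
print(sorted(heights))                        # text order
print(sorted(heights, key=height_to_inches))  # numeric order
```

With numeric columns, `sort_values` and the x-axis of a bar plot follow the actual measurements rather than the text order.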
sns.countplot(x = 'Work Rate', data = d, palette = 'hls',edgecolor="k")
plt.title('Different work rates of the Players Participating in the FIFA 2019', fontsize = 20)
plt.xlabel('Work rates associated with the players', fontsize = 16)
plt.ylabel('count of Players', fontsize = 16)
plt.show()
Players with a Medium/Medium work rate are the most common in FIFA 2019.
#Overall scores of players from selected nations
some_countries = ('England', 'Germany', 'Spain', 'Argentina', 'France', 'Brazil', 'Italy', 'Colombia') # tuple of country names
data_countries = d.loc[d['Nationality'].isin(some_countries)] # rows for the selected countries; "& d['Overall']" would bitwise-AND the ratings and keep only odd ones
data_countries.head()
| | Name | Age | Nationality | Overall | Potential | Club | Value | Wage | Preferred Foot | International Reputation | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | De Gea | 27 | Spain | 91 | 93 | Manchester United | €72M | €260K | Right | 4.0 | ... | 64.0 | 12.0 | 38.0 | 30.0 | 12.0 | 68.0 | 40.0 | 68.0 | 21.0 | 13.0 |
| 8 | Sergio Ramos | 32 | Spain | 91 | 91 | Real Madrid | €51M | €380K | Right | 4.0 | ... | 83.0 | 59.0 | 88.0 | 90.0 | 60.0 | 63.0 | 75.0 | 82.0 | 92.0 | 91.0 |
| 14 | N. Kanté | 27 | France | 89 | 90 | Chelsea | €63M | €225K | Right | 3.0 | ... | 76.0 | 69.0 | 90.0 | 92.0 | 71.0 | 79.0 | 54.0 | 85.0 | 91.0 | 85.0 |
| 15 | P. Dybala | 24 | Argentina | 89 | 94 | Juventus | €89M | €205K | Left | 3.0 | ... | 65.0 | 88.0 | 48.0 | 32.0 | 84.0 | 87.0 | 86.0 | 84.0 | 20.0 | 20.0 |
| 16 | H. Kane | 24 | England | 89 | 91 | Tottenham Hotspur | €83.5M | €205K | Right | 3.0 | ... | 84.0 | 85.0 | 76.0 | 35.0 | 93.0 | 80.0 | 90.0 | 89.0 | 36.0 | 38.0 |
5 rows × 44 columns
ax = sns.boxplot(x = data_countries['Nationality'], y = data_countries['Overall'], palette = 'spring') # creating a boxplot
ax.set_xlabel(xlabel = 'Countries', fontsize = 18)
ax.set_ylabel(ylabel = 'Overall Scores', fontsize = 18)
ax.set_title(label = 'Distribution of overall scores of players from different countries', fontsize = 20)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.show()
Among the selected countries, Brazil's players have the highest Overall ratings, typically a little above 70.
top_play=d[['Name','Overall',"Age",'Club']].sort_values(by='Overall',ascending=False) #sort a new frame instead of modifying a slice in place
top_100_play=top_play[:100]
fig=px.scatter(top_100_play,x='Age',y='Overall',color='Age',size='Overall',hover_data=['Name','Club'],title='Top Football Players in the FIFA 19 game')
fig.show()
From the graph above, the top-rated players in FIFA 19 are Messi and Ronaldo, both with an Overall rating of 94 and aged between 30 and 35.
#Overall scores of players from selected clubs
some_clubs = ('CD Leganés', 'Southampton', 'RC Celta', 'Empoli', 'Fortuna Düsseldorf', 'Manchester City',
    'Tottenham Hotspur', 'FC Barcelona', 'Valencia CF', 'Chelsea', 'Real Madrid') # creating a tuple of club names
data_clubs = d.loc[d['Club'].isin(some_clubs)] # rows for the selected clubs; "& d['Overall']" would bitwise-AND the ratings and keep only odd ones
data_clubs.head()
| | Name | Age | Nationality | Overall | Potential | Club | Value | Wage | Preferred Foot | International Reputation | ... | Strength | LongShots | Aggression | Interceptions | Positioning | Vision | Penalties | Composure | StandingTackle | SlidingTackle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | E. Hazard | 27 | Belgium | 91 | 91 | Chelsea | €93M | €340K | Right | 4.0 | ... | 66.0 | 80.0 | 54.0 | 41.0 | 87.0 | 89.0 | 86.0 | 91.0 | 27.0 | 22.0 |
| 6 | L. Modrić | 32 | Croatia | 91 | 91 | Real Madrid | €67M | €420K | Right | 4.0 | ... | 58.0 | 82.0 | 62.0 | 83.0 | 79.0 | 92.0 | 82.0 | 84.0 | 76.0 | 73.0 |
| 7 | L. Suárez | 31 | Uruguay | 91 | 91 | FC Barcelona | €80M | €455K | Right | 5.0 | ... | 83.0 | 85.0 | 87.0 | 41.0 | 92.0 | 84.0 | 85.0 | 85.0 | 45.0 | 38.0 |
| 8 | Sergio Ramos | 32 | Spain | 91 | 91 | Real Madrid | €51M | €380K | Right | 4.0 | ... | 83.0 | 59.0 | 88.0 | 90.0 | 60.0 | 63.0 | 75.0 | 82.0 | 92.0 | 91.0 |
| 14 | N. Kanté | 27 | France | 89 | 90 | Chelsea | €63M | €225K | Right | 3.0 | ... | 76.0 | 69.0 | 90.0 | 92.0 | 71.0 | 79.0 | 54.0 | 85.0 | 91.0 | 85.0 |
5 rows × 44 columns
ax = sns.boxplot(x = data_clubs['Club'], y = data_clubs['Overall'], palette = 'inferno') # creating a boxplot
ax.set_xlabel(xlabel = 'Some Popular Clubs', fontsize = 15)
ax.set_ylabel(ylabel = 'Overall Ratings', fontsize = 15)
ax.set_title(label = 'Distribution of Overall Ratings in Different popular Clubs', fontsize = 20)
plt.xticks(rotation = 45,fontsize=16)
plt.yticks(fontsize=16)
plt.show()
#Real Madrid has higher Overall ratings than the other popular clubs shown.
sns.histplot(fifa["BallControl"],color="crimson",linewidth=2)
plt.xticks(fontsize=18)
plt.xlabel("Ball control of players")
plt.ylabel("Number of players")
plt.show()
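ax=sns.heatmap(fifa.corr(numeric_only=True),annot=True) #numeric_only=True skips text columns such as Name and Club (needs pandas 1.5 or later)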
ax.set(xlabel=" ",ylabel=" ")
ax.xaxis.tick_top()
plt.xticks(rotation=90,fontsize=18)
plt.yticks(fontsize=15)
plt.show()
sns.histplot(fifa,x=fifa["International Reputation"],y=fifa["Overall"],bins=10,discrete=(True,False),
log_scale=(False,True),cbar=True,edgecolor="k")
plt.xticks(fontsize=15)
plt.yticks(fontsize=18)
plt.xlabel("International Reputation",fontsize=18)
plt.ylabel("Overall Ratings",fontsize=18)
plt.show()
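#Most players have an International Reputation of 1, so the densest bins sit in that column; the top Overall ratings belong to players with a higher reputation (Messi and Ronaldo, both rated 94, have a reputation of 5).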
fifa.hist(bins=50,color="c",figsize=(40,30),edgecolor="k")
plt.show()
sns.FacetGrid(fifa,hue="Position",height=4).map(plt.bar,"Preferred Foot","International Reputation").add_legend()
plt.show()
#Right-footed international players have more opportunities than left-footed ones.
sns.histplot(x="Position",data=d,hue="Position",palette="magma")
plt.xticks(fontsize=18,rotation=90)
plt.yticks(fontsize=18)
plt.show()
#Striker (ST) is the most common position.
x1=fifa["Position"].value_counts().head(5)
print(x1)
ST    2130
GK    1992
CB    1754
CM    1377
LB    1305
Name: Position, dtype: int64
label=x1.index
explode=[0,0,0,0,0.2]
color=sns.color_palette("Pastel1")
plt.figure(figsize=(20,6)) #create the figure first so the pie is drawn on it
plt.pie(x1,labels=label,autopct="%0.1f%%",explode=explode,colors=color,shadow=True,startangle=0,
    wedgeprops={"linewidth":1,"edgecolor":"k"})
plt.axis("equal")
plt.show()
x3=fifa["Work Rate"].value_counts().head(5)
print(x3)
Medium/ Medium    9685
High/ Medium      3131
Medium/ High      1660
High/ High        1007
Medium/ Low        840
Name: Work Rate, dtype: int64
l=x3.index
explode=[0,0,0.1,0,0]
color=sns.color_palette("Pastel2_r")
plt.figure(figsize=(20,6)) #create the figure first so the pie is drawn on it
plt.pie(x3,labels=l,autopct="%1.3f%%",explode=explode,colors=color,shadow=True,startangle=0,
    wedgeprops={"linewidth":1,"edgecolor":"k"})
plt.axis("equal")
plt.show()
sns.stripplot(x="Preferred Foot",y="Overall",data=d,size=9,color="lightgreen",marker="^",
edgecolor='k',alpha=0.9,linewidth=0.2)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.xlabel("Preferred Foot",fontsize=15,weight="bold")
plt.ylabel("Overall Rating",fontsize=15,weight="bold")
plt.title("Preferred Footwise Ratings of Players",fontsize=18)
plt.show()
#Right-footed players appear to have somewhat higher Overall ratings than left-footed players.
sns.lineplot(x=fifa["Age"],y=fifa["Stamina"],hue=fifa["Preferred Foot"],palette="mako_r")
sns.set_theme(style="darkgrid")
plt.xlabel("Age",fontsize=15,weight="bold")
plt.ylabel("Stamina",fontsize=15,weight="bold")
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.title("Relationship Between Age and Stamina",fontsize=18)
plt.show()
In the figure above, stamina peaks at about 70 for players aged 25-30, and left-footed players have higher stamina than right-footed players.
sns.pointplot(x=d["Position"],y=d["Age"],errorbar=("pi",100),capsize=0.4,join=False,palette="rocket")
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.xlabel("Position",fontsize=15,weight="bold")
plt.ylabel("Age",fontsize=15,weight="bold")
plt.title("Position and Age of Players",fontsize=18)
plt.show()
sns.violinplot(x=d["Preferred Foot"],y=d["Jumping"],palette="cividis")
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.xlabel("Preferred Foot",fontsize=15,weight="bold")
plt.ylabel("Jumping",fontsize=15,weight="bold")
plt.title("Preferred Foot and Jumping of players",fontsize=18)
plt.show()
From the graph above, right-footed players jump slightly better than left-footed players.
sns.barplot(x="Body Type",y="Overall",data=d,palette="coolwarm",edgecolor="k")
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.xlabel("Body Type",fontsize=15,weight="bold")
plt.ylabel("Overall Ratings",fontsize=15,weight="bold")
plt.title("Body Type and Overall ratings of players",fontsize=18)
plt.show()
#Body type column
fifa["Body Type"].value_counts()
Normal                 10436
Lean                    6351
Stocky                  1124
Messi                      1
C. Ronaldo                 1
Neymar                     1
Courtois                   1
PLAYER_BODY_TYPE_25        1
Shaqiri                    1
Akinfenwa                  1
Name: Body Type, dtype: int64
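Seven of the ten `Body Type` categories occur exactly once, so a bar plot of the raw column gives those single players as much room as the three main types. One option, sketched below on made-up values, is to keep `Normal`, `Lean` and `Stocky` and group everything else under a single label. The label `"Other"` and the toy Series are assumptions for illustration.

```python
import pandas as pd

# Made-up sample that mimics the mix of standard and one-off labels.
body = pd.Series(["Normal", "Lean", "Stocky", "Messi", "PLAYER_BODY_TYPE_25", "Lean"])

standard = ["Normal", "Lean", "Stocky"]

# Keep the standard labels and replace every other value with "Other".
cleaned = body.where(body.isin(standard), "Other")

print(cleaned.value_counts())
```

On the real data the same `where` call could be assigned back to `fifa["Body Type"]` before plotting.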
sns.barplot(x="Body Type",y="Overall",data=fifa,palette="cool",edgecolor="k")
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.xlabel("Body Type",fontsize=15,weight="bold")
plt.ylabel("Overall Ratings",fontsize=15,weight="bold")
plt.title("Body Type and Overall ratings of players",fontsize=18)
plt.show()
Conclusion:-
• Right-footed players jump slightly better than left-footed players.
• Stamina peaks at about 70 for players aged 25-30, and left-footed players have higher stamina than right-footed players.
• Right-footed players appear to have somewhat higher Overall ratings than left-footed players.
• Striker (ST) is the most common position.
• Right-footed international players have more opportunities than left-footed ones.
• Most players have an International Reputation of 1; the top Overall ratings belong to players with a higher reputation (Messi and Ronaldo, both rated 94, have a reputation of 5).
• Real Madrid has higher Overall ratings than the other popular clubs shown.
• The top-rated players in FIFA 19 are Messi and Ronaldo, both with an Overall rating of 94 and aged between 30 and 35.
• Among the selected countries, Brazil's players have the highest Overall ratings, typically a little above 70.
• Players with a Medium/Medium work rate are the most common in FIFA 2019.
• Dribbling skill decreases as weight increases, and very few players heavier than 205 lbs are good at dribbling.
• L. Messi has the highest wage of the five top-rated players plotted (€565K).
• Most players come from England, which has more than 1,600.
• Players aged 22-30 have the highest Potential.
• The density of Potential peaks at about 70, and left-footed players have higher Potential than right-footed players.
• A player's Value increases as their Overall rating increases.
• Among players aged 20-32, left-footed players show higher Potential than right-footed players.
• England has 1,662 players with an average Overall rating of about 63.42.
• Among the top 5 clubs, Juventus has very few young players, and most of its squad is older than 25. Paris Saint-Germain has a large number of young players.
• Argentina has players across the widest age range, roughly 18 to 40, and most of its players are between 22 and 32.
• Messi comes out ahead of Ronaldo in the skill comparison, since his skill values are higher here.
• 6'0 is the most common height, with around 2,700 players.
• Players aged 20-26 are the most numerous, and most of them are right-footed.